Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions
Quantification of the stationary points and the associated basins of
attraction of neural network loss surfaces is an important step towards a
better understanding of neural network loss surfaces at large. This work
proposes a novel method to visualise basins of attraction together with the
associated stationary points via gradient-based random sampling. The proposed
technique is used to perform an empirical study of the loss surfaces generated
by two different error metrics: quadratic loss and entropic loss. The empirical
observations confirm the theoretical hypothesis regarding the nature of neural
network attraction basins. Entropic loss is shown to exhibit stronger gradients
and fewer stationary points than quadratic loss, indicating that entropic loss
has a more searchable landscape. Quadratic loss is shown to be more resilient
to overfitting than entropic loss. Both losses are shown to exhibit local
minima, but the number of local minima is shown to decrease with an increase in
dimensionality. Thus, the proposed visualisation technique successfully
captures the local minima properties exhibited by the neural network loss
surfaces, and can be used for the purpose of fitness landscape analysis of
neural networks.
Comment: Preprint submitted to the Neural Networks journal.
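As a rough illustration of the gradient-based random sampling idea, the sketch below descends from random starting points on a toy two-dimensional surface and collects the distinct end points the walks settle into. The loss function, step size, and walk counts are all illustrative assumptions, not the paper's actual experimental setup:

```python
import numpy as np

def loss(w):
    # Toy 2-D surface with several basins (an assumed stand-in for a
    # neural network loss; the paper samples real network error landscapes).
    return np.sin(3 * w[0]) * np.cos(3 * w[1]) + 0.1 * np.dot(w, w)

def grad(w, eps=1e-6):
    # Central-difference gradient of the toy loss.
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (loss(w + e) - loss(w - e)) / (2 * eps)
    return g

def sample_basins(n_walks=50, n_steps=200, lr=0.05, seed=0):
    # Gradient-based random sampling: start walks at random points,
    # descend the gradient, and record where each walk settles.
    rng = np.random.default_rng(seed)
    stationary = []
    for _ in range(n_walks):
        w = rng.uniform(-2, 2, size=2)
        for _ in range(n_steps):
            w = w - lr * grad(w)
        stationary.append(np.round(w, 2))
    # Distinct end points approximate the stationary points; the set of
    # starting points that reach each one outlines its basin of attraction.
    return {tuple(s) for s in stationary}

print(len(sample_basins()))
```

Grouping the starting points by the stationary point they converge to is what allows the basins themselves to be visualised.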
Fitness Landscape Analysis of Feed-Forward Neural Networks
Neural network training is a highly non-convex optimisation problem with
poorly understood properties. Due to the inherent high dimensionality, neural
network search spaces cannot be intuitively visualised, thus other means to
establish search space properties have to be employed. Fitness landscape
analysis encompasses a selection of techniques designed to estimate the
properties of a search landscape associated with an optimisation problem.
Applied to neural network training, fitness landscape analysis can be used to
establish a link between the properties of the error landscape and various
neural network hyperparameters. This study applies fitness landscape analysis
to investigate the influence of the search space boundaries, regularisation
parameters, loss functions, activation functions, and feed-forward neural
network architectures on the properties of the resulting error landscape. A
novel gradient-based sampling technique is proposed, together with a novel
method to quantify and visualise stationary points and the associated basins
of attraction in neural network error landscapes.
Thesis (PhD), University of Pretoria, 2019.
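The kind of property estimation that fitness landscape analysis performs can be illustrated with a simple random-walk ruggedness measure. The error function, walk parameters, and lag-1 autocorrelation statistic below are illustrative assumptions rather than the thesis's exact techniques:

```python
import numpy as np

def error(w):
    # Toy error function standing in for a neural network error landscape.
    return np.sum(w ** 2) + 0.5 * np.sin(5 * np.sum(w))

def random_walk_fitness(n_steps=1000, step_size=0.1, dim=10, seed=1):
    # Random walk through the search space, recording error values.
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1, 1, size=dim)
    values = []
    for _ in range(n_steps):
        w = w + rng.uniform(-step_size, step_size, size=dim)
        values.append(error(w))
    return np.array(values)

def autocorrelation(values, lag=1):
    # Lag-1 autocorrelation of the walk: values near 1 suggest a smooth
    # landscape, values near 0 a rugged one.
    v = values - values.mean()
    return np.dot(v[:-lag], v[lag:]) / np.dot(v, v)

f = random_walk_fitness()
print(round(autocorrelation(f), 3))
```

Statistics of this kind, computed from sampled walks, are what link hyperparameter choices to measurable landscape properties.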
Black-Box Saliency Map Generation Using Bayesian Optimisation
Saliency maps are often used in computer vision to provide intuitive
interpretations of what input regions a model has used to produce a specific
prediction. A number of approaches to saliency map generation are available,
but most require access to model parameters. This work proposes an approach for
saliency map generation for black-box models, where no access to model
parameters is available, using a Bayesian optimisation sampling method. The
approach aims to find the global salient image region responsible for a
particular (black-box) model's prediction. This is achieved by a sampling-based
approach to model perturbations that seeks to localise the regions of an
image that are salient to the black-box model. Results show that the proposed
approach to
saliency map generation outperforms grid-based perturbation approaches, and
performs similarly to gradient-based approaches which require access to model
parameters.
Comment: Submitted to IJCNN 202
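A minimal sketch of the perturbation idea follows, with the Bayesian optimiser deliberately replaced by plain random sampling for brevity. The toy black-box model, image size, and rectangular region parameterisation are all assumptions:

```python
import numpy as np

def black_box_predict(image):
    # Hypothetical black-box model: responds only to the mean intensity of
    # a fixed "salient" patch (rows 8-16, cols 8-16) of a 32x32 image.
    return image[8:16, 8:16].mean()

def occlusion_score(image, x, y, size):
    # Score a candidate region by how much occluding it changes the
    # prediction; no access to model parameters or gradients is needed.
    occluded = image.copy()
    occluded[y:y + size, x:x + size] = 0.0
    return black_box_predict(image) - black_box_predict(occluded)

def find_salient_region(image, n_samples=200, seed=0):
    # Sampling loop over candidate regions; the paper drives this search
    # with Bayesian optimisation, replaced here by random sampling.
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_samples):
        size = int(rng.integers(4, 16))
        x = int(rng.integers(0, 32 - size))
        y = int(rng.integers(0, 32 - size))
        s = occlusion_score(image, x, y, size)
        if s > best_score:
            best, best_score = (x, y, size), s
    return best

print(find_salient_region(np.ones((32, 32))))
```

A Bayesian optimiser would use the scores observed so far to propose the next region, needing far fewer queries than random or grid search.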
Genetic Micro-Programs for Automated Software Testing with Large Path Coverage
Ongoing progress in computational intelligence (CI) has led to an increased
desire to apply CI techniques for the purpose of improving software engineering
processes, particularly software testing. Existing state-of-the-art automated
software testing techniques focus on utilising search algorithms to discover
input values that achieve high execution path coverage. These algorithms are
trained on the same code that they intend to test, requiring instrumentation
and lengthy search times to test each software component. This paper outlines a
novel genetic programming framework, where the evolved solutions are not input
values, but micro-programs that can repeatedly generate input values to
efficiently explore a software component's input parameter domain. We also
argue that our approach can be generalised to many different software systems,
and is thus not specific to the particular software component on which it was
trained.
Comment: A version of this paper has been accepted for publication in CEC'2
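The micro-program idea can be sketched roughly as follows. The op set, the toy component under test, and the mutation-only genetic loop are illustrative assumptions, not the paper's actual framework:

```python
import random

def software_under_test(x):
    # Toy component: returns which execution path an input triggers.
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    if x % 2 == 0:
        return "even"
    return "odd"

OPS = ["add1", "sub1", "double", "negate"]

def run_micro_program(program, n_values=20):
    # A micro-program is a short op sequence applied repeatedly to a
    # counter, emitting one candidate input value per iteration.
    x, values = 1, []
    for _ in range(n_values):
        for op in program:
            if op == "add1":
                x += 1
            elif op == "sub1":
                x -= 1
            elif op == "double":
                x *= 2
            elif op == "negate":
                x = -x
        values.append(x)
    return values

def coverage(program):
    # Fitness: number of distinct paths exercised by the emitted inputs.
    return len({software_under_test(v) for v in run_micro_program(program)})

def evolve(generations=30, pop_size=20, prog_len=4, seed=0):
    # Minimal mutation-only genetic loop over micro-programs.
    rng = random.Random(seed)
    pop = [[rng.choice(OPS) for _ in range(prog_len)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=coverage, reverse=True)
        survivors = pop[:pop_size // 2]
        children = []
        for p in survivors:
            child = p[:]
            child[rng.randrange(prog_len)] = rng.choice(OPS)
            children.append(child)
        pop = survivors + children
    return max(pop, key=coverage)

best = evolve()
print(coverage(best))
```

The key contrast with conventional search-based testing is that the evolved artefact is a reusable input generator, not a fixed set of input values.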
Empirical Loss Landscape Analysis of Neural Network Activation Functions
Activation functions play a significant role in neural network design by
enabling non-linearity. The choice of activation function was previously shown
to influence the properties of the resulting loss landscape. Understanding the
relationship between activation functions and loss landscape properties is
important for neural architecture and training algorithm design. This study
empirically investigates neural network loss landscapes associated with
hyperbolic tangent, rectified linear unit, and exponential linear unit
activation functions. Rectified linear unit is shown to yield the most convex
loss landscape, while exponential linear unit is shown to yield the least flat
loss landscape and to exhibit superior generalisation performance. The
presence of wide and narrow valleys in the loss landscape is established for
all activation functions, and the narrow valleys are shown to correlate with
saturated neurons and implicitly regularised network configurations.
Comment: Accepted for publication in the Genetic and Evolutionary Computation
Conference Companion, July 15--19, 2023, Lisbon, Portugal.
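One common way to probe such properties empirically is to evaluate the loss along a one-dimensional slice through weight space for each activation function. The tiny network, random data, and slice parameters below are illustrative assumptions, not the study's experimental setup:

```python
import numpy as np

# The three activation functions compared in the study.
def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def elu(x, a=1.0):
    return np.where(x > 0, x, a * (np.exp(x) - 1))

def mse_loss(w, act, X, y):
    # Loss of a tiny one-hidden-layer network; weights flattened into w.
    W1, W2 = w[:10].reshape(5, 2), w[10:15].reshape(1, 5)
    return np.mean((W2 @ act(W1 @ X) - y) ** 2)

def loss_slice(act, n_points=50, seed=0):
    # 1-D slice of the loss along a random direction in weight space, a
    # common proxy for inspecting landscape flatness and convexity.
    rng = np.random.default_rng(seed)
    X, y = rng.standard_normal((2, 20)), rng.standard_normal((1, 20))
    w, d = rng.standard_normal(15), rng.standard_normal(15)
    alphas = np.linspace(-1, 1, n_points)
    return [mse_loss(w + a * d, act, X, y) for a in alphas]

for act in (tanh, relu, elu):
    s = loss_slice(act)
    print(act.__name__, round(min(s), 3), round(max(s), 3))
```

Comparing the shape of such slices across activation functions is one concrete way that convexity and flatness claims like those above can be inspected.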
Cauchy Loss Function: Robustness Under Gaussian and Cauchy Noise
In supervised machine learning, the choice of loss function implicitly
assumes a particular noise distribution over the data. For example, the
frequently used mean squared error (MSE) loss assumes a Gaussian noise
distribution. The choice of loss function during training and testing affects
the performance of artificial neural networks (ANNs). It is known that MSE may
yield substandard performance in the presence of outliers. The Cauchy loss
function (CLF) assumes a Cauchy noise distribution, and is therefore
potentially better suited for data with outliers. This paper aims to determine
the extent of robustness and generalisability of the CLF as compared to MSE.
CLF and MSE are assessed on a few handcrafted regression problems, and a
real-world regression problem with artificially simulated outliers, in the
context of ANN training. CLF yielded results that were either comparable to or
better than the results yielded by MSE, with a few notable exceptions.
Comment: A version of this paper was accepted for publication in SACAIR'2
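One common form of the Cauchy loss is shown below against MSE on residuals containing a single simulated outlier; the scale parameter c and the residual values are illustrative assumptions:

```python
import numpy as np

def mse(residuals):
    # Mean squared error: quadratic growth, so outliers dominate.
    return np.mean(residuals ** 2)

def cauchy_loss(residuals, c=1.0):
    # One common form of the Cauchy loss; it grows logarithmically, so
    # large residuals (outliers) are penalised far less than under MSE.
    return np.mean(np.log1p((residuals / c) ** 2))

r_clean = np.array([0.1, -0.2, 0.15, -0.05])
r_outlier = np.append(r_clean, 50.0)  # one simulated outlier

print(mse(r_outlier) / mse(r_clean))            # MSE inflates sharply
print(cauchy_loss(r_outlier) / cauchy_loss(r_clean))
```

The logarithmic growth is what makes the implied Cauchy noise assumption heavier-tailed, and hence more forgiving of outliers, than the Gaussian assumption behind MSE.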
Comparison Of Adversarial And Non-Adversarial LSTM Music Generative Models
Algorithmic music composition is a way of composing musical pieces with
minimal to no human intervention. While recurrent neural networks are
traditionally applied to many sequence-to-sequence prediction tasks, including
successful implementations of music composition, their standard supervised
learning approach based on input-to-output mapping leads to a lack of note
variety. These models can therefore be seen as potentially unsuitable for tasks
such as music generation. Generative adversarial networks learn the generative
distribution of data and lead to varied samples. This work implements and
compares adversarial and non-adversarial training of recurrent neural network
music composers on MIDI data. The resulting music samples are evaluated by
human listeners, and their preferences are recorded. The evaluation indicates
that adversarial training produces more aesthetically pleasing music.
Comment: Submitted to a 2023 conference, 20 pages, 13 figures.